Performance of Hybrid MPI/OpenMP VASP on Cray XC40 Based on Intel Knights Landing Many Integrated Core Architecture

نویسندگان

  • Zhengji Zhao
  • Martijn Marsman
  • Florian Wende
  • Jeongnim Kim
چکیده

With the recent installation of Cori, a Cray XC40 system with Intel Xeon Phi Knights Landing (KNL) many integrated core (MIC) architecture, NERSC is transitioning from the multi-core to the more energy-efficient many-core era. The developers of VASP, a widely used materials science code, have adopted MPI/OpenMP parallelism to better exploit the increased on-node parallelism, wider vector units, and the high bandwidth on-package memory (MCDRAM) of KNL. To achieve optimal performance, KNL specifics relevant for the build, boot and run time setup must be explored. In this paper, we will present the performance analysis of representative VASP workloads on Cori, focusing on the effects of the compilers, libraries, and boot/run time options such as the NUMA/MCDRAM modes, HyperThreading, huge pages, core specialization, and thread scaling. The paper is intended to serve as a KNL performance guide for VASP users, but it will also benefit other KNL users.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Porting of the DBCSR Library for Sparse Matrix-Matrix Multiplications to Intel Xeon Phi Systems

Multiplication of two sparse matrices is a key operation in the simulation of the electronic structure of systems containing thousands of atoms and electrons. The highly optimized sparse linear algebra library DBCSR (Distributed Block Compressed Sparse Row) has been specifically designed to efficiently perform such sparse matrix-matrix multiplications. This library is the basic building block f...

متن کامل

Optimizing Cray MPI and SHMEM Software Stacks for Cray-XC Supercomputers based on Intel KNL Processors

HPC applications commonly use Message Passing Interface (MPI) and SHMEM programming models to achieve high performance in a portable manner. With the advent of the Intel MIC processor technology, hybrid programming models that involve the use of MPI/SHMEM along with threading models (such as OpenMP) are gaining traction. However, most current generation MPI implementations are not poised to off...

متن کامل

Evaluating the networking characteristics of the Cray XC-40 Intel Knights Landing-based Cori supercomputer at NERSC

There are many potential issues associated with deploying the Intel Xeon PhiTM (code named Knights Landing (KNL)) manycore processor in a large-scale supercomputer. One in particular is the ability to fully utilize the high-speed communications network, given the serial performance of a Xeon PhiTM core is a fraction of a Xeon® core. In this paper we take a look at the tradeoffs associated with ...

متن کامل

Estimating the Performance Impact of the MCDRAM on KNL Using Dual-Socket Ivy Bridge Nodes on Cray XC30

NERSC is preparing for its next petascale system, named Cori, a Cray XC system based on the Intel KNL MIC architecture. Each Cori node will have 72 cores (288 threads), 512 bit vector units, and a low capacity (16GB) and high bandwidth (~5x DDR4) on-package memory (MCDRAM or HBM). To help applications get ready for Cori, NERSC has developed optimization strategies that focus on the MPI+OpenMP p...

متن کامل

Long-time simulations with complex code using multiple nodes of Intel Xeon Phi Knights Landing

Modern partial differential equation (PDE) models across scientific disciplines require sophisticated numerical methods resulting in complex codes as well as large numbers of simulations for analysis like parameter studies and uncertainty quantification. To evaluate the behavior of the model for sufficeintly long times, for instance, to compare to laboratory time scales, often requires long-tim...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2017